98 research outputs found

VST++: Efficient and Stronger Visual Saliency Transformer

Author: Han Junwei
Liu Nian
Luo Ziyang
Zhang Ni
Publication date: 18/10/2023

While previous CNN-based models have exhibited promising results for salient object detection (SOD), their ability to explore global long-range dependencies is restricted. Our previous work, the Visual Saliency Transformer (VST), addressed this constraint from a transformer-based sequence-to-sequence perspective, to unify RGB and RGB-D SOD. In VST, we developed a multi-task transformer decoder that concurrently predicts saliency and boundary outcomes in a pure transformer architecture. Moreover, we introduced a novel token upsampling method called reverse T2T for predicting a high-resolution saliency map effortlessly within transformer-based structures. Building upon the VST model, we further propose an efficient and stronger VST version in this work, i.e. VST++. To mitigate the computational costs of the VST model, we propose a Select-Integrate Attention (SIA) module, partitioning foreground into fine-grained segments and aggregating background information into a single coarse-grained token. To incorporate 3D depth information with low cost, we design a novel depth position encoding method tailored for depth maps. Furthermore, we introduce a token-supervised prediction loss to provide straightforward guidance for the task-related tokens. We evaluate our VST++ model across various transformer-based backbones on RGB, RGB-D, and RGB-T SOD benchmark datasets. Experimental results show that our model outperforms existing methods while achieving a 25% reduction in computational costs without significant performance compromise. The demonstrated strong ability for generalization, enhanced performance, and heightened efficiency of our VST++ model highlight its potential

arXiv.org e-Print Archive

SAMN: A Sample Attention Memory Network Combining SVM and NN in One Architecture

Author: Chen Ziyang
Luo Linkai
Peng Hong
Yang Qiaoling
Zhang Haoyu
Publication date: 25/09/2023

Support vector machines (SVM) and neural networks (NN) are strongly complementary. SVM focuses on the inner operation among samples while NN focuses on the operation among the features within samples. Thus, it is promising and attractive to combine SVM and NN, as it may provide a more powerful function than SVM or NN alone. However, current work on combining them lacks true integration. To address this, we propose a sample attention memory network (SAMN) that effectively combines SVM and NN by incorporating a sample attention module, class prototypes, and a memory block into the NN. SVM can be viewed as a sample attention machine, which allows us to add a sample attention module to the NN to implement the main function of SVM. Class prototypes are representatives of all classes, which can be viewed as alternatives to support vectors. The memory block is used for the storage and update of class prototypes. Class prototypes and the memory block effectively reduce the computational cost of sample attention and make SAMN suitable for multi-classification tasks. Extensive experiments show that SAMN achieves better classification performance than a single SVM or a single NN with similar parameter sizes, as well as the previous best model for combining SVM and NN. The sample attention mechanism is a flexible module that can be easily deepened and incorporated into neural networks that require it.

arXiv.org e-Print Archive

MaxMin-L2-SVC-NCH: A New Method to Train Support Vector Classifier with the Selection of Model's Parameters

Author: Chen Ziyang
Luo Linkai
Peng Hong
Wang Yiding
Yang Qiaoling
Publication date: 14/07/2023

The selection of model's parameters plays an important role in the application of support vector classification (SVC). The commonly used method of selecting model's parameters is the k-fold cross validation with grid search (CV). It is extremely time-consuming because it needs to train a large number of SVC models. In this paper, a new method is proposed to train SVC with the selection of model's parameters. Firstly, training SVC with the selection of model's parameters is modeled as a minimax optimization problem (MaxMin-L2-SVC-NCH), in which the minimization problem is an optimization problem of finding the closest points between two normal convex hulls (L2-SVC-NCH) while the maximization problem is an optimization problem of finding the optimal model's parameters. A lower time complexity can be expected in MaxMin-L2-SVC-NCH because CV is abandoned. A gradient-based algorithm is then proposed to solve MaxMin-L2-SVC-NCH, in which L2-SVC-NCH is solved by a projected gradient algorithm (PGA) while the maximization problem is solved by a gradient ascent algorithm with dynamic learning rate. To demonstrate the advantages of the PGA in solving L2-SVC-NCH, we carry out a comparison of the PGA and the famous sequential minimal optimization (SMO) algorithm after an SMO algorithm and some KKT conditions for L2-SVC-NCH are provided. It is revealed that the SMO algorithm is a special case of the PGA. Thus, the PGA can provide more flexibility. The comparative experiments between MaxMin-L2-SVC-NCH and the classical parameter selection models on public datasets show that MaxMin-L2-SVC-NCH greatly reduces the number of models to be trained without losing test accuracy relative to the classical models. It indicates that MaxMin-L2-SVC-NCH performs better than the other models. We strongly recommend MaxMin-L2-SVC-NCH as a preferred model for SVC tasks.

arXiv.org e-Print Archive
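
The inner problem named in the abstract above (closest points between two convex hulls, solved by a projected gradient algorithm) can be illustrated with a minimal sketch. It is not the authors' L2-SVC-NCH formulation: it assumes a linear kernel, omits the L2 penalty and the outer gradient-ascent loop over model parameters, and uses a fixed step size. The names `project_simplex`, `combine`, and `nearest_points` are hypothetical.

```python
def project_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        t = (cumulative - 1.0) / i
        if ui - t > 0:
            theta = t  # keep the threshold from the last index that passes
    return [max(x - theta, 0.0) for x in v]


def combine(points, weights):
    """Convex combination sum_i weights[i] * points[i]."""
    dim = len(points[0])
    return [sum(w * p[d] for w, p in zip(weights, points)) for d in range(dim)]


def nearest_points(pos, neg, step=0.02, iters=200):
    """Projected gradient descent on 0.5 * ||sum_i a_i x_i - sum_j b_j x_j||^2."""
    a = [1.0 / len(pos)] * len(pos)
    b = [1.0 / len(neg)] * len(neg)
    for _ in range(iters):
        p, q = combine(pos, a), combine(neg, b)
        diff = [pi - qi for pi, qi in zip(p, q)]
        # Gradient w.r.t. a_i is <x_i, diff>; w.r.t. b_j it is -<x_j, diff>.
        grad_a = [sum(x * d for x, d in zip(xi, diff)) for xi in pos]
        grad_b = [-sum(x * d for x, d in zip(xj, diff)) for xj in neg]
        # Gradient step, then project each weight vector back onto its simplex.
        a = project_simplex([ai - step * g for ai, g in zip(a, grad_a)])
        b = project_simplex([bi - step * g for bi, g in zip(b, grad_b)])
    return combine(pos, a), combine(neg, b)


# Toy data (an assumption for illustration): two separable triangles.
pos = [(2.0, 0.0), (3.0, 1.0), (3.0, -1.0)]
neg = [(-2.0, 0.0), (-3.0, 1.0), (-3.0, -1.0)]
p, q = nearest_points(pos, neg)
print(p, q)
```

The step size is chosen below the reciprocal of the largest curvature of the objective on this toy data; other data would need a different value or a line search.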

YOLOv5-TS: Detecting traffic signs in real-time

Author: Jiquan Shen
Junwei Luo
Xiaohong Zhang
Ziyang Zhang
Publication venue: Frontiers Media S.A.
Publication date: 01/11/2023

Traffic sign detection plays a vital role in assisted driving and automatic driving. YOLOv5, as a one-stage object detection solution, is very suitable for traffic sign detection. However, it suffers from the problem of false detection and missed detection of small objects. To address this issue, we have made improvements to YOLOv5 and subsequently introduced YOLOv5-TS in this work. In YOLOv5-TS, a spatial pyramid with depth-wise convolution is proposed by replacing maximum pooling operations in spatial pyramid pooling with depth-wise convolutions. It is applied to the backbone to extract multi-scale features while preventing feature loss. A Multiple Feature Fusion module is proposed to fuse multi-scale feature maps multiple times with the purpose of enhancing both the semantic expression ability and the detail expression ability of feature maps. To improve the accuracy in detecting small and even extra-small objects, a specialized detection layer is introduced by utilizing the highest-resolution feature map. Besides, a new method based on k-means++ is proposed to generate stable anchor boxes. The experiments on the data set verify the usefulness and effectiveness of our work.

Directory of Open Access Journals
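
The abstract above mentions anchor boxes generated with a method based on k-means++. As a rough illustration of the standard k-means++ seeding step only (not the authors' variant, which the abstract does not describe), the sketch below picks initial anchors from (width, height) pairs using a 1 - IoU distance. The distance choice, the function names, and the sample boxes are all assumptions.

```python
import random


def iou_wh(a, b):
    """IoU of two boxes given as (width, height), aligned at a common corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    union = a[0] * a[1] + b[0] * b[1] - inter
    return inter / union


def kmeanspp_seeds(boxes, k, rng):
    """Pick k initial anchors with k-means++ seeding under a 1 - IoU distance."""
    seeds = [rng.choice(boxes)]
    while len(seeds) < k:
        # Squared distance from every box to its nearest already-chosen seed.
        d2 = [min((1.0 - iou_wh(b, s)) ** 2 for s in seeds) for b in boxes]
        # Sample the next seed with probability proportional to that distance,
        # so boxes far from all current seeds are favoured.
        seeds.append(rng.choices(boxes, weights=d2, k=1)[0])
    return seeds


# Hypothetical box sizes in pixels, for illustration only.
boxes = [(8, 8), (10, 12), (9, 10), (30, 32), (28, 35), (64, 60), (70, 66)]
anchors = kmeanspp_seeds(boxes, k=3, rng=random.Random(0))
print(sorted(anchors))
```

A full anchor-generation routine would follow the seeding with the usual assign-and-update iterations; those are omitted here.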

Zero-Shot Rumor Detection with Propagation Structure via Prompt Learning

Author: Jiang Haiyun
Lin Hongzhan
Liu Ruifang
Luo Ziyang
Ma Jing
Shi Shuming
Yi Pengyao
Publication date: 29/03/2023

The spread of rumors along with breaking events seriously hinders the truth in the era of social media. Previous studies reveal that due to the lack of annotated resources, rumors presented in minority languages are hard to detect. Furthermore, the unforeseen breaking events not involved in yesterday's news exacerbate the scarcity of data resources. In this work, we propose a novel zero-shot framework based on prompt learning to detect rumors falling in different domains or presented in different languages. More specifically, we first represent rumors circulated on social media as diverse propagation threads, then design a hierarchical prompt encoding mechanism to learn language-agnostic contextual representations for both prompts and rumor data. To further enhance domain adaptation, we model the domain-invariant structural features from the propagation threads, to incorporate structural position representations of influential community response. In addition, a new virtual response augmentation method is used to improve model training. Extensive experiments conducted on three real-world datasets demonstrate that our proposed model achieves much better performance than state-of-the-art methods and exhibits a superior capacity for detecting rumors at early stages. Comment: AAAI 202

arXiv.org e-Print Archive

Fiber Vector Bend Sensor Based on Multimode Interference and Image Tapping

Author: Fiebrandt Julia
Luo Jiajun
Madhav Kalaga
Rahman Aashia
Roth Martin M.
Sun Kai
Wang Yu
Zhang Ziyang
Publication venue: Basel : MDPI
Publication date: 01/01/2019

A grating-less fiber vector bend sensor is demonstrated using a standard single mode fiber spliced to a multimode fiber as a multimode interference device. The ring-shaped light intensity distribution at the end of the multimode fiber is subject to a vector transition in response to the fiber bend. Instead of comprehensive imaging processing for the analysis, the image can be tapped out by a seven-core fiber spliced to the other end of the multimode fiber. The seven-core fiber is further guided to seven single mode fibers via a commercial fan-out device. By comparing the relative light intensities received at the seven outputs, both the bend radius and its direction can be determined. Experiment has shown that a slight bend displacement of 10 µm over a 1.2-cm-long multimode fiber in the X direction (bend angle of 0.382°) causes a distinctive power imbalance of 4.6 dB between two chosen outputs (numbered C4 and C7). For the same displacement in the Y direction, the power ratio between the previous two outputs C4 and C7 remains constant, while the imbalance between another pair (C3 and C4) rises significantly to 7.0 dB. © 2019 by the authors. Licensee MDPI, Basel, Switzerland

Repositorium für Naturwissenschaften und Technik

Scalable mode division multiplexed transmission over a 10-km ring-core fiber using high-order orbital angular momentum modes

Author: Cai Xinlun
Chen Yujie
Du Cheng
Hu Ziyang
Liu Jie
Luo Wenyong
Wu Xiong
Yu Siyuan
Zhu Guoxuan
Zhu Jiangbo
Publication venue: The Optical Society
Publication date: 22/01/2018

We propose and demonstrate a scalable mode division multiplexing scheme based on orbital angular momentum modes in ring core fibers. In this scheme, the high-order mode groups of a ring core fiber are sufficiently de-coupled by the large differential effective refractive index so that multiple-input multiple-output (MIMO) equalization is only used for crosstalk equalization within each mode group. We design and fabricate a graded-index ring core fiber that supports 5 mode groups with low inter-mode-group coupling, small intra-mode-group differential group delay, and small group velocity dispersion slope over the C-band for the high-order mode groups. We implement a two-dimensional wavelength- and mode-division multiplexed transmission experiment involving 10 wavelengths and 2 mode groups each with 4 OAM modes, transmitting 32 GBaud Nyquist QPSK signals over all 80 channels. An aggregate capacity of 5.12 Tb/s and an overall spectral efficiency of 9 bit/s/Hz over 10 km are realized, only using modular 4x4 MIMO processing with 15 taps to recover signals from the intra-mode-group mode coupling. Given the fixed number of modes in each mode group and the low inter-mode-group coupling in ring core fibres, our scheme strikes a balance in the trade-off between system capacity and digital signal processing complexity, and therefore has good potential for capacity upscaling at an expense of only modularly increasing the number of mode-groups with fixed-size (4x4) MIMO blocks

Northumbria Research Link

Explore Bristol Research
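
The channel count and aggregate capacity quoted in the ring-core fiber abstract above can be checked with a few lines of arithmetic. The sketch assumes each channel carries one 32 GBaud QPSK stream and that no coding or framing overhead is deducted; the variable names are illustrative.

```python
# Figures quoted in the abstract.
wavelengths = 10
mode_groups = 2
modes_per_group = 4        # OAM modes carried by each mode group
symbol_rate_gbaud = 32
bits_per_symbol = 2        # QPSK encodes 2 bits per symbol

channels = wavelengths * mode_groups * modes_per_group
raw_rate_gbps = channels * symbol_rate_gbaud * bits_per_symbol
raw_rate_tbps = raw_rate_gbps / 1000

print(channels)        # 80
print(raw_rate_tbps)   # 5.12
```

Both results agree with the 80 channels and 5.12 Tb/s stated in the abstract. The 9 bit/s/Hz spectral efficiency depends on channel spacing and overhead that the abstract does not give, so it is not reproduced here.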

The ALMA-QUARKS survey: I. Survey description and data reduction

Author: Bronfman Leonardo
Cheng Yu
Dewangan Lokesh
Evans Neal
Garay Guido
Goldsmith Paul
Gu Qilao
He Jinhua
Kim Kee-Tae
Li Shanghuo
Li Ziyang
Liu Hong-Li
Liu Sheng-Yuan
Liu Tie
Liu Xunchuan
Lu Xing
Luo Qiuyi
Mai Xiaofeng
Mardones Diego
Qin Sheng-Li
Saha Anindya
Sanhueza Patricio
Shen Zhiqiang
Stutz Amelia
Tatematsu Ken'ichi
Tej Anandmayee
Wang Ke
Xu Fengwei
Zhang Qizhou
Zhang Siju
Zhang Suinan
Zhang Zhenying
Zhou Jianwen
Zhu Lei
Publication date: 14/11/2023

This paper presents an overview of the QUARKS survey, which stands for 'Querying Underlying mechanisms of massive star formation with ALMA-Resolved gas Kinematics and Structures'. The QUARKS survey is observing 139 massive clumps covered by 156 pointings at ALMA Band 6 ($\lambda\sim$ 1.3 mm). In conjunction with data obtained from the ALMA-ATOMS survey at Band 3 ($\lambda\sim$ 3 mm), QUARKS aims to carry out an unbiased statistical investigation of massive star formation process within protoclusters down to a scale of 1000 au. This overview paper describes the observations and data reduction of the QUARKS survey, and gives a first look at an exemplar source, the mini-starburst Sgr B2(M). The wide-bandwidth (7.5 GHz) and high-angular-resolution (~0.3 arcsec) observations of the QUARKS survey allow to resolve much more compact cores than could be done by the ATOMS survey, and to detect previously unrevealed fainter filamentary structures. The spectral windows cover transitions of species including CO, SO, N$_2^+$, SiO, H$_{30}\alpha$, H$_2$CO, CH$_3$CN and many other complex organic molecules, tracing gas components with different temperatures and spatial extents. QUARKS aims to deepen our understanding of several scientific topics of massive star formation, such as the mass transport within protoclusters by (hub-)filamentary structures, the existence of massive starless cores, the physical and chemical properties of dense cores within protoclusters, and the feedback from already formed high-mass young protostars. Comment: 9 figures, 4 tables, accepted by RA

arXiv.org e-Print Archive